Abstract



We prove, in the universe of trees of bounded height, that for any MSO formula with $m$ variables
there exists a set of kernels such that the size of each of these kernels can be bounded by an
elementary function of $m$. This yields a faster MSO model checking algorithm for trees of
bounded height than the one for general trees. From that we obtain, by means of interpretation,
corresponding results for the classes of graphs of bounded tree-depth (MSO$_2$) and shrub-depth
(MSO$_1$), and thus we give wide generalizations of Lampis' (ESA 2010) and Ganian's (IPEC 2011)
results. In the second part of the paper we use this kernel structure to show that FO has the same
expressive power as MSO$_1$ on the graph classes of bounded shrub-depth. This makes bounded
shrub-depth a good candidate for characterization of the hereditary classes of graphs on which
FO and MSO$_1$ coincide, a problem recently posed by Elberfeld, Grohe, and Tantau (LICS 2012).
1 Introduction

First-order (FO) and monadic second-order (MSO) logics undoubtedly play a crucial role
in computer science. Besides their traditional tight relations to finite automata and regular
languages, this is also witnessed by their frequent occurrence in the so-called algorithmic
metatheorems, which have gained increasing popularity in the past few years. The term
algorithmic metatheorem commonly refers to a general algorithmic toolbox ready to be
applied to a wide range of problems in specific situations, and MSO or FO logic is often
used in the expression of this "range of problems".

One of the perhaps most celebrated algorithmic metatheorems (and the original
motivation for our research) is Courcelle's theorem [3] stating that every graph property $\phi$
expressible in the MSO$_2$ logic of graphs (allowing for both vertex and edge set quantifiers)
can be decided in linear FPT time on graphs of bounded tree-width. Courcelle, Makowsky,
and Rotics [4] then analogously addressed a wider class of graphs, namely those of
bounded clique-width, at the expense of restricting $\phi$ to MSO$_1$ logic (i.e., with only vertex
set quantification). Among other recent works on algorithmic metatheorems we just briefly
mention two survey articles by Kreutzer [15] and by Grohe-Kreutzer [14], and an interesting
recent advance by Dvořák, Král', and Thomas [7] showing linear-time FPT decidability of
FO model checking on the graphs of "bounded expansion".

Returning to Courcelle's theorem [3] and the closely related [TJH], it is worth remarking
that a solution can be obtained via interpretation of the respective graph problem into an
MSO formula over coloured trees (which relates the topic all the way back to Rabin's S2S
theorem [20]). However, a drawback of these metatheorems is that, when their runtime is
expressed as $O\big(f(\phi, \mathrm{width}(G)) \cdot |G|\big)$, the function $f$ grows asymptotically as

$$ f \sim \left. 2^{2^{\cdot^{\cdot^{\cdot^{\mathrm{width}(G)}}}}} \right\} a , $$

where the height $a$ depends on $\phi$, precisely on the quantifier alternation depth of $\phi$ (i.e., $f$ is
a non-elementary function of the parameter $\phi$). The latter is not surprising since Frick and
Grohe [TTJ [TU] proved that it is not possible to avoid a non-elementary tower of exponents
even in deciding MSO properties on all trees or coloured paths (unless P=NP).

Given the importance of Courcelle's and other related algorithmic metatheorems, it is
somewhat surprising that apparently no research paper tackled this "non-elementary exponential
tower" issue of deciding graph MSO properties until recently. The first step in this direction
occurred in a 2010 ESA paper by Lampis [16], giving an FPT algorithm for MSO$_2$ model
checking on graphs of bounded vertex cover with only a double-exponential parameter
dependence. Ganian [12] then analogously addressed the MSO$_1$ model checking problem on graphs
of bounded so-called twin-cover (much restricting bounded clique-width).



MSO on trees of bounded height 

Frick-Grohe's negative result leaves the main room for possible improvement on suitably
restricted subclass(es) of all coloured trees, namely on those avoiding long paths. In this respect,
our first result here (Theorem 3.3 and Corollary 3.4) gives a new algorithm for deciding
MSO properties $\phi$ of rooted $m$-coloured trees $T$ of fixed height $d$. This algorithm uses
so-called kernelization, which means it efficiently reduces the input tree into an equivalent one
of elementarily bounded size, leading to an FPT algorithm with runtime

$$ O(|V(T)|) + \left. 2^{2^{\cdot^{\cdot^{\cdot^{O(|\phi|^2)}}}}} \right\} d+1 . $$

Informally, our algorithm "trades" quantifier alternation of $\phi$ for bounded height of the
tree. Hence there is nothing interesting brought for all trees, while on the other hand, our
algorithm presents an improvement over the previous ones on the trees of height $\le d$ for every
fixed value $d$. We refer to Section 3.1 for details and the exact expression of the runtime.

In a more general perspective, our algorithm can be straightforwardly applied to any
suitable "depth-structured" graph class via efficient interpretability of logic theories. This
includes the aforementioned results of Lampis [16] and Ganian [12] as special cases. We
moreover extend the algorithm (Theorem 3.5) to testing MSO$_2$ properties on all graphs of
tree-depth $\le d$ (see Definition 2.1) in elementary FPT, covering a much wider graph class
than that of bounded vertex cover. This, in Section 3.2, concludes the first half of our paper.



Expressive power of FO and MSO 

Secondly, the existence of an (elementarily-sized) kernel for MSO properties $\phi$ of trees of
fixed height $d$ (Theorem 3.3) is interesting on its own. Particularly, it immediately implies
that any such MSO sentence $\phi$ can be equivalently expressed in FO on the trees of height $d$
(simply testing the finitely many bounded-size kernels for which $\phi$ is true, Theorem 4.1).
This brings us to the very recent paper of Elberfeld, Grohe, and Tantau [9] who proved that
FO and MSO$_2$ have equal expressive power on the graphs of bounded tree-depth. Their
approach is different and uses a constructive extension of the Feferman-Vaught theorem for
unbounded partitions. We can now similarly derive the result from Theorem 3.3 as in the
tree case.

Going a step further, we actually half-answer the main open question posed in [9]:
what characterizes the hereditary graph classes on which the expressive powers of FO and
MSO$_1$ coincide? We use Theorem 3.3 and the new notions of [13] to prove that FO and
MSO$_1$ coincide (Theorem 4.3) on all graph classes of bounded so-called shrub-depth (see
Definition 2.3). Unfortunately, due to the lack of a suitable "forbidden substructure"
characterization of shrub-depth, we are not yet able to prove the converse direction, but we conjecture
that a hereditary class on which FO and MSO$_1$ coincide must have bounded shrub-depth
(Conjecture 4.4). This conjecture is also supported by the following claim in [13]: a graph
class $\mathcal{C}$ has an MSO$_1$ interpretation in the class of coloured trees of height $\le d$ iff $\mathcal{C}$ is of
shrub-depth $\le d$.

Figure 1 The path of length 14 has tree-depth 3 + 1 = 4 since it is contained in the closure of
the depicted (red) tree of height 3. It can be proved that this is optimal.



2 Preliminaries 

We assume standard terminology and notation of graph theory, see e.g. Diestel [5]. Due to
limited space, we refer to [5] for the standard definition of tree-width $\mathrm{tw}(G)$.

For an introduction to parameterized complexity we suggest Now we just recall that 
a problem $\mathcal{P}$ with an input $(x, k) \in \Sigma^* \times \mathbb{N}$ is fixed parameter tractable, or FPT, if it admits
an algorithm in time $O(f(k) \cdot |x|^{O(1)})$ where $f$ is an arbitrary computable function. It is
known that $\mathcal{P}$ is in FPT if, and only if, it has a kernel, i.e., every instance $(x, k)$ can be in
polynomial time transformed to an equivalent instance $(x', k')$ such that $(x, k) \in \mathcal{P} \iff
(x', k') \in \mathcal{P}$ and $|(x', k')| \le g(k)$ for some computable $g$.



Measuring depth of graphs 

Our paper deals with some not-so-known decompositions of graphs, too. The first one is 
related to tree-decompositions of low depth. 

► Definition 2.1 (Tree-depth [17]). The closure $\mathrm{cl}(F)$ of a rooted forest $F$ is the graph
obtained from $F$ by adding from each node all edges to its descendants. The tree-depth
$\mathrm{td}(G)$ of a graph $G$ is one more than the smallest height (distance from the root to all
leaves) of a rooted forest $F$ such that $G \subseteq \mathrm{cl}(F)$.

Note that tree-depth is always an upper bound for tree-width. Some useful properties
of it can be derived from the following asymptotic characterization: If $L$ is the length of a
longest path in a graph $G$, then $\lceil \log_2(L + 2) \rceil \le \mathrm{td}(G) \le L + 1$. See Figure 1. For a simple
proof of this, as well as for a more extensive study of tree-depth, we refer the reader to (T5] 
Chapter 6]. Particularly, it follows that td(G) can be approximated up to an exponential 
error by a depth-first search, and furthermore computed exactly in linear FPT using the 
tree-width algorithm of Bodlaender [2].
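
As a small illustration of Definition 2.1, the following Python sketch computes tree-depth by brute force. It relies on the standard recursive characterization, which is an assumption not spelled out in this paper: the empty graph has tree-depth 0, a disconnected graph has the maximum tree-depth of its components, and a connected graph has 1 plus the minimum tree-depth over all single-vertex deletions. The names `tree_depth` and `path` are illustrative only, and the running time is exponential in general, so the sketch is meant for tiny inputs such as the path of Figure 1.

```python
from functools import lru_cache
from math import ceil, log2


def tree_depth(adj):
    """Tree-depth of the graph given as {vertex: set_of_neighbours}."""

    def components(vertices):
        # Connected components of the subgraph induced on `vertices`.
        seen, comps = set(), []
        for start in vertices:
            if start in seen:
                continue
            comp, stack = {start}, [start]
            while stack:
                u = stack.pop()
                for w in adj[u]:
                    if w in vertices and w not in comp:
                        comp.add(w)
                        stack.append(w)
            seen |= comp
            comps.append(frozenset(comp))
        return comps

    @lru_cache(maxsize=None)
    def td(vertices):
        if not vertices:
            return 0
        comps = components(vertices)
        if len(comps) > 1:
            return max(td(c) for c in comps)
        # Connected case: the deleted vertex plays the role of the root.
        return 1 + min(td(vertices - {v}) for v in vertices)

    return td(frozenset(adj))


def path(n):
    """Path on n vertices 0, ..., n-1 (its length is n - 1)."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


if __name__ == "__main__":
    L = 14                      # path of length 14, as in Figure 1
    depth = tree_depth(path(L + 1))
    print(depth, ceil(log2(L + 2)) <= depth <= L + 1)
```

The memoization on vertex subsets keeps the path example cheap, since only intervals of the path (and intervals with one vertex removed) ever occur as arguments.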

Besides tree-width, another useful width measure of graphs is clique-width, defined for a
graph $G$ as the smallest number of labels $k = \mathrm{cw}(G)$ such that $G$ can be constructed using
operations to create a new vertex with label $i$, take the disjoint union of two labeled graphs,
add all edges between vertices of label $i$ and label $j$, and relabel all vertices with label $i$ to
have label $j$. Similarly as tree-depth is related to tree-width, there exists a very new notion
of shrub-depth [13] which is (in a sense) related to clique-width, and which we explain next.

Figure 2 The graph obtained from $K_{3,3}$ by subdividing a matching belongs to $\mathcal{TM}_3(2)$. The
respective tree model is depicted on the right.

► Definition 2.2 (Tree model [13]). We say that a graph $G$ has a tree model of $m$ colours
and depth $d \ge 1$ if there exists a rooted tree $T$ (of height $d$) such that

i. the set of leaves of $T$ is exactly $V(G)$,

ii. the length of each root-to-leaf path in $T$ is exactly $d$,

iii. each leaf of $T$ is assigned one of $m$ colours (this is not a graph colouring, though),

iv. and the existence of a $G$-edge between $u, v \in V(G)$ depends solely on the colours of $u, v$
and the distance between $u, v$ in $T$.

The class of all graphs having a tree model of $m$ colours and depth $d$ is denoted by $\mathcal{TM}_m(d)$.
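
To make Definition 2.2 concrete, here is a hedged Python sketch that reconstructs the edge set of a graph from a tree model. The encoding is an assumption made for illustration: each leaf is stored with its root-to-leaf sequence of tree nodes (`leaf_paths`), `colour` maps leaves to colours, and `edge_rule(c1, c2, dist)` is a hypothetical symmetric predicate deciding adjacency from the two colours and the tree distance.

```python
from itertools import combinations


def graph_from_tree_model(leaf_paths, colour, edge_rule):
    """Edge set of the graph encoded by a tree model.

    leaf_paths[v] is the root-to-leaf sequence of tree nodes ending in v;
    all sequences start at the common root and have the same length d + 1.
    """
    edges = set()
    for u, v in combinations(sorted(leaf_paths), 2):
        pu, pv = leaf_paths[u], leaf_paths[v]
        depth = len(pu) - 1
        # Length of the common prefix; at least 1 because the root is shared.
        common = 0
        for a, b in zip(pu, pv):
            if a != b:
                break
            common += 1
        # Both leaves sit (depth - (common - 1)) levels below their
        # lowest common ancestor.
        distance = 2 * (depth - (common - 1))
        if edge_rule(colour[u], colour[v], distance):
            edges.add((u, v))
    return edges


if __name__ == "__main__":
    # K_{3,3} with 2 colours and depth 1: edges join leaves of different colour.
    leaves = ["a0", "a1", "a2", "b0", "b1", "b2"]
    paths = {v: ("root", v) for v in leaves}
    col = {v: v[0] for v in leaves}
    k33 = graph_from_tree_model(paths, col, lambda c1, c2, dist: c1 != c2)
    print(len(k33))
```

With a single colour and the rule that always answers "yes", the same function yields a complete graph, matching the examples given right after the definition.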



For instance, $K_n \in \mathcal{TM}_1(1)$ or $K_{n,n} \in \mathcal{TM}_2(1)$. Definition 2.2 is further illustrated in
Figure 2. It is easy to see that each class $\mathcal{TM}_m(d)$ is closed under complements and induced
subgraphs, but neither under disjoint unions, nor under subgraphs. One can also routinely
verify that each class $\mathcal{TM}_m(d)$ is of bounded clique-width. The depth $d$ of a tree model can
be seen as a generalization of the aforementioned tree-depth parameter, and for that reason
it is useful to work with a more streamlined notion which only requires a single parameter.
To this end we introduce the following (and we refer to [13] for additional details):

► Definition 2.3 (Shrub-depth [13]). A class of graphs $\mathcal{G}$ has shrub-depth $d$ if there exists $m$
such that $\mathcal{G} \subseteq \mathcal{TM}_m(d)$, while for all natural $m$ it is $\mathcal{G} \not\subseteq \mathcal{TM}_m(d - 1)$.



Note that Definition 2.3 is asymptotic as it makes sense only for infinite graph classes.
Particularly, classes of shrub-depth 1 are known as the graphs of bounded neighbourhood
diversity in [16], i.e., those graph classes on which the twin relation on pairs of vertices (for
a pair to share the same set of neighbours besides this pair) has a finite index.



MSO logic on graphs 

Monadic second-order logic (MSO) is an extension of first-order logic (FO) by quantification 
over sets. On the one-sorted adjacency model of graphs it specifically reads as follows: 

► Definition 2.4 (MSO$_1$ logic of graphs). The language of MSO$_1$ contains the expressions
built from the following elements:

■ variables $x, y, \ldots$ for vertices, and $X, Y, \ldots$ for sets of vertices,
■ the predicates $x \in X$ and $\mathrm{edge}(x, y)$ with the standard meaning,
■ equality for variables, the connectives $\wedge, \vee, \neg, \to$, and the quantifiers $\forall, \exists$ over vertex and
vertex-set variables.

Note that we do not allow quantification over edges or sets of edges (as edges are not
elements) in MSO$_1$. If we consider the two-sorted incidence graph model (in which the edges
form another sort of elements), then we get:

► Definition 2.5 (MSO$_2$ logic of graphs). The language of MSO$_2$ contains the expressions
built from elements of MSO$_1$ plus the following:

■ variables $e, f, \ldots$ for edges, $E, F, \ldots$ for sets of edges, the respective quantifiers, and
■ the predicates $e \in F$ and $\mathrm{inc}(x, e)$ with the standard meaning.

Already MSO$_1$ logic is quite powerful as it can express various common hard graph
properties, e.g., 3-colourability. The expressive power of MSO$_2$ is even strictly larger [5]
since, for instance, Hamiltonicity has an MSO$_2$ definition (while not an MSO$_1$ one). On the other
hand, MSO$_2$ and MSO$_1$ coincide on the class of trees, or on many other restricted graph
classes. Hence we will speak only about MSO$_1$ on trees, from now on. The large expressive
power of MSO logics is the reason for their popularity in algorithmic metatheorems.

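The problem to decide, for a sentence $\psi$ in logic $\mathcal{L}$, whether an input structure $G$ satisfies
$G \models \psi$, is also commonly called the $\mathcal{L}$ model checking problem (of $\psi$). Hence, for instance,
the $c$-colourability problem for each fixed $c$ is an instance of MSO$_1$ model checking, where

$$ \psi \equiv \exists X_1, \ldots, X_c.\ \Big[ \big( \forall x.\ \textstyle\bigvee_{i=1}^{c} x \in X_i \big) \wedge \big( \forall x, y.\ \textstyle\bigwedge_{i=1}^{c} ( x \notin X_i \vee y \notin X_i \vee \neg\,\mathrm{edge}(x, y) ) \big) \Big] . $$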
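
The following Python sketch evaluates the $c$-colourability sentence above literally, by trying every valuation of the set variables $X_1, \ldots, X_c$ and then checking the two universally quantified parts. It is only meant to show what "model checking by exhaustive expansion of quantifiers" looks like on a toy scale; the function name `is_c_colourable` and the input encoding (a vertex list and a list of edge pairs) are assumptions of this sketch, and its running time is exponential.

```python
from itertools import product


def is_c_colourable(vertices, edges, c):
    """Evaluate the sentence psi by trying every valuation of X_1, ..., X_c."""
    vertices = list(vertices)
    # For one vertex, a tuple of c booleans says which sets X_i contain it.
    memberships = list(product([False, True], repeat=c))
    for choice in product(memberships, repeat=len(vertices)):
        member = dict(zip(vertices, choice))
        # First conjunct: every vertex lies in at least one X_i.
        covered = all(any(member[x]) for x in vertices)
        # Second conjunct: no X_i contains both ends of an edge.
        proper = all(
            not (member[x][i] and member[y][i])
            for x, y in edges
            for i in range(c)
        )
        if covered and proper:
            return True
    return False


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(is_c_colourable(range(3), triangle, 2))
    print(is_c_colourable(range(3), triangle, 3))
```

Note that, exactly as in the sentence, the sets are not required to be disjoint; a vertex lying in several sets can simply pick any one of them as its colour.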

3 Trees of Bounded Height and MSO 

The primary purpose of this section is to prove Theorem 3.3: for any $m$-coloured tree $T$
of constant height $h$ there exists an efficiently computable subtree $T_0 \subseteq T$ such that, for any
MSO$_1$ sentence $\phi$ of fixed quantifier rank $r$, it is $T \models \phi \iff T_0 \models \phi$, and the size of $T_0$ is
bounded by an elementary function of $r$ and $m$ (the dependence on $h$ being non-elementary,
though). Particularly, since checking of an MSO$_1$ property $\phi$ can be easily solved in time
$O^*(2^{c|\phi|})$ on a graph with $c$ vertices (in this case $T_0$) by recursive exhaustive expansion of
all quantifiers of $\phi$, this gives a kernelization-based elementary FPT algorithm for MSO$_1$
model checking of rooted $m$-coloured trees of constant height $h$ (Corollary 3.4).

We need a bit more formal notation. The height $h$ of a rooted tree $T$ is the farthest
distance from its root, and a node is at the level $\ell$ if its distance from the root is $h - \ell$. For
a node $v$ of a rooted tree $T$, we call a limb of $v$ a subtree of $T$ rooted at some child node of
$v$. Our rooted trees are unordered, and they "grow top-down", i.e., we depict the root on the
top. For this section we also switch from considering $m$-coloured trees to more convenient
$t$-labelled ones, the difference being that one vertex may have several labels at once (and so
$m \sim 2^t$). MSO$_1$ logic is naturally extended to labelled graphs by adding unary predicates
$L(x)$ for every label $L$. We say that two such rooted labelled trees are l-isomorphic if there
is an isomorphism between them preserving the root and all labels.



3.1 The Reduction Lemma 

Concretely, we preprocess a given tree $T$ into a bounded kernel $T_0 \subseteq T$ by recursively
deleting from $T$ all limbs which are "repeating (being l-isomorphic) too many times". This
is formalized in Lemma 3.1. To describe the exact reduction of $T$ to $T_0$, we need to define
the following recursive "threshold" values, for $i = 0, 1, 2, \ldots$:

$$ R_i(q,s,k) = q \cdot N_i(q,s,k)^s, \quad\text{where} \tag{1} $$
$$ N_0(q,s,k) = 2^k + 1 \ge 2 \quad\text{and} $$
$$ N_{i+1}(q,s,k) = 2^k \cdot \big(R_i(q,s,k) + 1\big)^{N_i(q,s,k)} \le 2^k \cdot \big(2q \cdot N_i(q,s,k)^s\big)^{N_i(q,s,k)} . \tag{2} $$
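
To get a feeling for how fast these thresholds grow, the recurrences (1) and (2) can be evaluated directly with Python's arbitrary-precision integers. The helper name `thresholds` is illustrative only. Already $N_2$ has hundreds of bits for the smallest parameters, and $N_3$ is far beyond what can be stored, so the sketch should only be called with $i \le 2$.

```python
def thresholds(i, q, s, k):
    """Return the pair (R_i(q,s,k), N_i(q,s,k)) from recurrences (1) and (2)."""
    n = 2**k + 1                      # N_0 = 2^k + 1
    for _ in range(i):
        r = q * n**s                  # R_j = q * N_j^s, equation (1)
        n = 2**k * (r + 1) ** n       # N_{j+1} = 2^k * (R_j + 1)^{N_j}, equation (2)
    return q * n**s, n


if __name__ == "__main__":
    for level in range(3):
        r, n = thresholds(level, q=1, s=1, k=1)
        print(level, r.bit_length(), n.bit_length())
```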






Faster Deciding MSO Properties of Trees of Fixed Height, and Some Consequences 



For clarity, we informally outline in advance the intended meaning of these values $R_i$ and
$N_i$. We say a labelled rooted tree $T$ of height $i$ is $(q, s, k)$-reduced if, at any level $j$, $0 < j \le i$,
each node of $T$ has at most (1) $R_{j-1}(q, s, k)$ pairwise l-isomorphic limbs (which are of height
$\le j - 1$). The value (2) $N_j(q, s, k)$ is then an upper bound on the number of all possible
non-l-isomorphic rooted $k$-labelled trees $T$ of height $\le j$ that are $(q, s, k)$-reduced. Note that
$N_0(q, s, k)$ accounts for all distinct $k$-labelled single-node trees and the empty tree.

Assume now any MSO$_1$ sentence (closed formula) $\phi$ with $q$ element variables and $s$ set
variables, and a height $i$. Then, provided $a, b > R_i(q, s, k)$ where $k = t + 3q + s$, we show that
the sentence $\phi$ cannot distinguish between $a$ disjoint copies and $b$ disjoint copies of any
$(q, s, k)$-reduced rooted $t$-labelled tree of height $i$. Altogether formally:

► Lemma 3.1. Let $T$ be a rooted $t$-labelled tree of height $h$, and let $\phi$ be an MSO$_1$ sentence
with $q$ element quantifiers and $s$ set quantifiers. Suppose that $u \in V(T)$ is a node at level
$i + 1$ where $i < h$.

a) If among all the limbs of $u$ in $T$ there are more than $R_i(q, s, t + 3q + s)$ pairwise
l-isomorphic ones, then let $T' \subseteq T$ be obtained by deleting one of the latter limbs from $T$.
Then, $T \models \phi \iff T' \models \phi$.

b) Consequently, there exists a rooted $t$-labelled tree $T_0 \subseteq T$ such that $T_0$ is
$(q, s, t + 3q + s)$-reduced, and $T \models \phi \iff T_0 \models \phi$.

In the case of FO logic, a statement analogous to Lemma 3.1 is obtained using folklore
arguments of finite model theory (even full recursive expansion of all $q$ vertex quantifiers in $\phi$
could "hit" only a bounded number of limbs of $u$ and the rest would not matter). However, in
the case of MSO logic there are additional nontrivial complications which require new ideas
(in addition to standard tools) in the proof. Briefly, one has to recursively consider
the internal structure of the limbs of $u$, and show that even an expansion of a vertex-set
quantifier in $\phi$ does not effectively distinguish too many of them (and hence some of them
remain irrelevant for the decision whether $T \models \phi$).
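
The reduction of Lemma 3.1 b) can be sketched in Python as follows. This is only an illustration under stated assumptions: trees are nested pairs `(labels, children)` with `labels` a collection of strings, the helper names `canonical`, `reduce_tree` and `size` are hypothetical, and `threshold(j)` stands for the value $R_j(q, s, t + 3q + s)$ supplied by the caller. The sketch sorts canonical forms, so it does not attain linear time.

```python
def canonical(tree):
    """Canonical form of a rooted labelled tree given as (labels, children).

    Two trees are l-isomorphic exactly when their canonical forms are equal.
    """
    labels, children = tree
    return (tuple(sorted(labels)), tuple(sorted(canonical(c) for c in children)))


def reduce_tree(tree, level, threshold):
    """Bottom-up reduction: a node at level `level` keeps at most
    threshold(level - 1) limbs from each l-isomorphism class."""
    labels, children = tree
    classes = {}
    for child in children:
        reduced = reduce_tree(child, level - 1, threshold)
        classes.setdefault(canonical(reduced), []).append(reduced)
    kept = []
    for key in sorted(classes):
        # All limbs in one class are l-isomorphic, so any prefix will do.
        kept.extend(classes[key][: threshold(level - 1)])
    return (labels, kept)


def size(tree):
    """Number of nodes of the tree."""
    return 1 + sum(size(c) for c in tree[1])


if __name__ == "__main__":
    leaf = ((), [])
    middle = (("b",), [leaf] * 5)
    top = (("a",), [middle] * 4)
    kernel = reduce_tree(top, 2, lambda j: 2)   # toy threshold, not R_j
    print(size(top), size(kernel))
```

Reducing the limbs before grouping them matters: two limbs that differ only in the number of repeated sub-limbs become l-isomorphic after their own reduction and are then counted in the same class.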

Before proceeding with the formal proof of Lemma 3.1 we need to justify the intended
meaning of the values $N_i$.

► Lemma 3.2. For any natural $i, q, s$, and $k$, there are at most $N_i(q, s, k)$ pairwise
non-l-isomorphic $(q, s, k)$-reduced rooted $k$-labelled trees of height $\le i$.

Proof. This claim readily follows from (1) and (2) by induction on $i$. The base case $i = 0$
is trivial, and the count includes also the empty tree. A rooted $k$-labelled tree $T$ of height
$\le i + 1$ can be described by a labelling of its root $r$ ($2^k$ possibilities), and a set of its limbs,
each one of height $\le i$. This set of limbs can be fully described by the numbers of limbs
(between $0$ and $R_i(q, s, k)$) in each of the $\le N_i(q, s, k)$ possible l-isomorphism classes. Hence
by (2) we have got at most $N_{i+1}(q, s, k)$ possible distinct descriptions of $T$. ◄

Proof of Lemma 3.1. Note first that part b) readily follows by a recursive bottom-up
application of a) to the whole tree. Hence we focus on a), and sketch our proof as follows:

(I) We are going to use a so-called "quantifier elimination" approach.¹ That means,
assuming $T \models \phi \not\Leftrightarrow T' \models \phi$, we look at the "distinguishing choice" of the first
quantifier in $\phi$, and encode it in the labeling of $T$ (e.g., when $\phi \equiv \exists x.\,\psi$, we give new
exclusive labels to the value of $x$ and to its parent/children in $T$ and $T'$). By an
inductive assumption, we then argue that the shorter formula $\psi$ cannot distinguish
between these newly labeled $T$ and $T'$, which is a contradiction.
(II) The traditional quantifier elimination approach, namely of set quantifiers in $\phi$,
however, might not be directly applicable to even very many pairwise l-isomorphic limbs
in $T$ if their size is unbounded. Roughly explaining, the problem is that a single
valuation of a set variable on these repeated limbs may potentially pairwise distinguish
all of them. Hence additional combinatorial arguments are necessary to bound the
size of the limbs in consideration.
(III) Having successfully resolved the technical step (II), the rest of the proof is a careful composition
of inductive arguments using the formula (1) $R_i(q, s, k) = q \cdot N_i(q, s, k)^s$.

¹ This approach has been inspired by recent [7], though here it is applied in a wider setting of MSO logic.

The whole proof goes through by means of contradiction. That is, we assume $T \models \phi$
while $T' \models \neg\phi$ (a counterexample to Lemma 3.1 a), where $T'$ implicitly depends on the choice
of $u$), up to natural symmetry between $\phi$ and $\neg\phi$ in this context. Let $t' = t + 3q + s$. Let
$B_1, \ldots, B_p \subseteq T$ where $p > R_i(q, s, t') \ge 1$ be the pairwise l-isomorphic limbs of $u$ in $T$, as
anticipated in Lemma 3.1 a). So, say, $T' = T - V(B_1)$. We will apply nested induction,
primarily targeting the structure of the sentence $\phi$, or simply the value $q + s$. For that we
assume $\phi$ in the prenex form, i.e., with a leading section of all quantifiers. If $q = s = 0$,
then $\phi$ is a propositional formula which evaluates to true or false without respect to $T$ or
$T'$. Hence we further assume $q + s > 0$. Note also the little trick with the choice of $t'$ which
"makes room" for (I) adding further labels to $T$ in the course of the proof.

(Minimality setup) To overcome the complication in (II), we have to deal with limbs
$B_1, \ldots, B_p$ of bounded size. So, among all the assumed counterexamples to Lemma 3.1 a)
for this particular $\phi$ or symmetric $\neg\phi$, choose one (meaning precisely the choice of $T$ and $u$
within it) which minimizes the size of $B_1$ (same as the sizes of $B_2, \ldots, B_p$). This minimality
choice actually represents a secondary induction in our proof.

We would like to show that the l-isomorphic limbs $B_1, \ldots, B_p$ are $(q, s, t')$-reduced.
Suppose not, and let $w_k \in V(B_k)$ be a node at level $j + 1$ such that among all the limbs of
$w_k$ in $B_k$ there are more than $R_j(q, s, t')$ pairwise l-isomorphic ones, hereafter denoted by
$D_{k,1}, \ldots, D_{k,r}$ where $r > R_j(q, s, t')$. This choice is made for all $k = 1, \ldots, p$ symmetrically,
i.e., all the subtrees $B_k^- = B_k - V(D_{k,1})$ where $k = 1, \ldots, p$ are pairwise l-isomorphic, too.

(Reduction phase) We define a sequence of trees by $U_0 = T$ and $U_k = U_{k-1} - V(D_{k,1})$ for
$k = 1, \ldots, p$. Recall that $U_0 \models \phi$. If it ever happened that $U_{k-1} \models \phi$ but $U_k \models \neg\phi$, then
we would consider $U_{k-1}$ and $w_k$ in place of $T$ and $u$ above, and hence contradict the choice
minimizing $B_1$ (which would be replaced with the smaller $D_{k,1}$). We may thus say that $U_p \models \phi$.
We similarly define $U'_1 = T'$ and $U'_k = U'_{k-1} - V(D_{k,1})$ for $k = 2, \ldots, p$ (recall that $B_1$ has
been removed from $T'$). With an analogous argument we conclude that $U'_p \models \neg\phi$.

Note that, now, $B_1^-, \ldots, B_p^-$ are pairwise l-isomorphic limbs of $u$ in $U_p$, and they are
strictly smaller than $B_1$. Since $U'_p = U_p - V(B_1^-)$, we may have chosen $U_p$ and $u$ in place of
$T, u$, again contradicting minimality of $B_1$ in the choice above. Indeed, the (original) limbs
$B_1, \ldots, B_p$ are $(q, s, t')$-reduced in $T$.

(Quantifier elimination: $\exists x$) As the main induction step we now "eliminate" the leading
quantifier of $\phi$ as follows. Suppose first that $\phi \equiv \exists x.\,\psi$. Let $a \in V(T)$ be such that
$T[x = a] \models \psi(x)$. Clearly, it can be chosen $a \notin V(B_1)$ since $B_1$ is l-isomorphic to the other
$B_2, \ldots, B_p$. On the other hand, $T'[x = b] \not\models \psi(x)$ for all $b \in V(T')$.

We define a $(t + 3)$-labelled tree $T_a$ which results from $T$ by adding a new label $L_x$
exclusively to the node $a$, a new label $L_{px}$ exclusively to the parent node of $a$, and $L_{cx}$
to the child nodes of $a$. A tree $T'_a = T_a - V(B_1)$ is formed analogously from $T'$. Then
we translate the formula $\psi(x)$ with free $x$ into a closed one $\psi_x$ as defined next: All label
predicates $L(x)$ in $\psi(x)$ are simply evaluated as $L(a)$ over $T$. Any predicate $x = y$ is replaced
with $L_x(y)$. Finally, all predicates for edges $(x, y)$ and $(y, x)$ in this parent-child order are
replaced with $L_{cx}(y)$ and $L_{px}(y)$, respectively. It is trivial that $T[x = a] \models \psi(x) \iff
T_a \models \psi_x$, and $T'[x = a] \models \psi(x) \iff T'_a \models \psi_x$.

All the limbs $B_1, \ldots, B_p$ remain pairwise l-isomorphic in $T_a$ unless, say, $a \in V(B_p)$.
Even in the latter case we anyway obtain, using (1), at least $p - 1 > R_i(q, s, t') - 1 =
q \cdot N_i(q, s, t')^s - 1 \ge (q - 1) \cdot N_i(q, s, t')^s \ge R_i(q - 1, s, t')$ pairwise l-isomorphic limbs of $u$ in
$T_a$, including $B_1$. Note also that $q - 1$ is the number of element quantifiers in $\psi_x$, and that
the combined parameter $t + 3 + 3(q - 1) + s = t + 3q + s = t'$ remains the same. Hence
we can apply the inductive assumption to $T_a$, $u$ and $\psi_x$, concluding that $T_a \models \psi_x \iff
T'_a \models \psi_x$, a contradiction.

(Quantifier elimination: $\exists X$) We are finally getting to the heart of the proof. Suppose
now that $\phi \equiv \exists X.\,\psi$. Let $A \subseteq V(T)$ be such that $T[X = A] \models \psi(X)$. On the other hand,
$T'[X = A'] \not\models \psi(X)$ for all $A' \subseteq V(T')$. We define a $(t + 1)$-labelled tree $T^A$ which results
from $T$ by adding a new label $L_X$ precisely to all members of $A$. Then we translate the
formula $\psi(X)$ with free $X$ into a closed one $\psi_X$ by replacing every occurrence of $y \in X$ with
$L_X(y)$. Trivially, $T[X = A] \models \psi(X) \iff T^A \models \psi_X$.

Note again that $s - 1$ is the number of set quantifiers in $\psi_X$, and that the combined
parameter $t + 1 + 3q + (s - 1) = t + 3q + s = t'$ remains the same. A key observation is
that "casting" the new label $L_X$ onto the limbs $B_1, \ldots, B_p$ may create at most $N_i(q, s, t')$
l-isomorphism classes among them. This is simply because, for each $k = 1, \ldots, p$, the
corresponding $B_k^A$ carries $t + 1 \le t'$ labels, and it is of height $i$ and $(q, s, t')$-reduced again.
Hence, altogether, there are at most $N_i(q, s, t')$ pairwise non-l-isomorphic choices for such
$B_k^A$ by Lemma 3.2.



So, among all $B_1, \ldots, B_p$, there are at least $p / N_i(q, s, t')$ pairwise l-isomorphic limbs,
and using (1), $p / N_i(q, s, t') > R_i(q, s, t') / N_i(q, s, t') = q \cdot N_i(q, s, t')^{s-1} \ge R_i(q, s - 1, t')$.
For simplicity, let the latter limbs be $B_1, \ldots, B_{p'}$ where $p \ge p' > R_i(q, s - 1, t')$. Now we
apply the inductive assumption to $T^A$, $u$ and $\psi_X$. Up to symmetry between the limbs, we
get $(T^A)' = T^A - V(B_1)$ such that $T^A \models \psi_X \iff (T^A)' \models \psi_X$. Now we can define
$A' \subseteq V(T')$ as the set of those nodes having label $L_X$ in $(T^A)'$, and hence $(T^A)' \models \psi_X
\iff T'[X = A'] \models \psi(X)$, a contradiction to the initial assumption.

(Quantifier elimination: $\forall$) Finally, the cases of universal quantifiers in $\phi$ are solved
analogously ($\neg\exists$ in place of $\forall$). ◄



3.2 Algorithmic applications 

With some calculus, we summarize the obtained result from an algorithmic point of view.
Let $\exp^{(i)}(x)$ be the $i$-fold exponential function defined inductively as follows: $\exp^{(0)}(x) = x$
and $\exp^{(i+1)}(x) = 2^{\exp^{(i)}(x)}$. Note that $\exp^{(h)}(x)$ is an elementary function of $x$ for each
particular height $h$. For a rooted $t$-labelled tree $T$ of height $\le h$, we call the uniquely
determined maximal $(q, s, k)$-reduced tree $T_0 \subseteq T$ from Lemma 3.1 b), where $k = t + 3q + s$,
a $(q, s, k)$-reduction of the tree $T$. Then we routinely get:

► Theorem 3.3. Let $t, h \ge 1$ be integers, and let $\phi$ be an MSO$_1$ sentence with $q$ element
quantifiers and $s$ set quantifiers. For each rooted $t$-labelled tree $T$ of height $h$, the tree $T_0 \subseteq T$
which is a $(q, s, t + 3q + s)$-reduction of $T$, and hence satisfies $T_0 \models \phi \iff T \models \phi$, can be computed in
linear time (non-parameterized) from $T$. Moreover, its size is bounded by

$$ |V(T_0)| \le \exp^{(h)} \big[ (2^{h+5} - 12) \cdot (t + q + s)(q + s) \big] . $$
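
As a sanity check of the size bound, the following Python sketch compares the product $\prod_{i<h} (1 + R_i \cdot N_i)$, which bounds the size of a $(q, s, k)$-reduced tree, with the closed formula of Theorem 3.3. The function names are illustrative. Because $\exp^{(2)}$ of a three-digit number cannot be stored, the comparison for $h = 2$ is done on bit lengths, and the sketch is only usable for $h \le 2$.

```python
def exp_tower(i, x):
    """i-fold exponential: exp^(0)(x) = x and exp^(i+1)(x) = 2 ** exp^(i)(x)."""
    for _ in range(i):
        x = 2**x
    return x


def kernel_size_product(h, q, s, t):
    """Product of (1 + R_i * N_i) over i = 0, ..., h-1, with k = t + 3q + s."""
    k = t + 3 * q + s
    n = 2**k + 1                          # N_0
    r = None
    total = 1
    for level in range(h):
        if level > 0:
            n = 2**k * (r + 1) ** n       # N_level from the previous R and N
        r = q * n**s                      # R_level
        total *= 1 + r * n
    return total


def theorem_bound_argument(h, q, s, t):
    """The bracketed argument (2^(h+5) - 12) * (t+q+s) * (q+s) of exp^(h)."""
    return (2 ** (h + 5) - 12) * (t + q + s) * (q + s)


if __name__ == "__main__":
    print(kernel_size_product(1, 1, 1, 1))
    print(kernel_size_product(1, 1, 1, 1) <= exp_tower(1, theorem_bound_argument(1, 1, 1, 1)))
    # For h = 2 compare exponents: x <= 2**(2**a) holds whenever x.bit_length() <= 2**a.
    print(kernel_size_product(2, 1, 1, 1).bit_length() <= 2 ** theorem_bound_argument(2, 1, 1, 1))
```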
Proof. We first use a natural adaptation of the classical linear-time tree-isomorphism
algorithm to construct the tree $T_0$. Then we note that by Lemma 3.1 b) the tree $T_0$ is
$(q, s, k)$-reduced where $k = t + 3q + s$. Hence we can consider $T_0 \subseteq U_{h,q,s,t}$ where $U_{h,q,s,t}$ is the
"maximal" $(q, s, k)$-reduced rooted $t$-labelled tree of height $h$, i.e., the one which contains
(at each level $j$) precisely $R_{j-1}(q, s, k)$ limbs of each l-isomorphism class (for heights $\le j - 1$).
By Lemma 3.2 the number of descendants at each level $j$ of $U_{h,q,s,t}$ is at most
$R_{j-1}(q, s, k) \cdot N_{j-1}(q, s, k)$. So, the total number of vertices in $U_{h,q,s,t}$ is at most

$$ 1 + R_{h-1}(q,s,k) \cdot N_{h-1}(q,s,k) \cdot \Big( 1 + R_{h-2}(q,s,k) \cdot N_{h-2}(q,s,k) \cdot \big( 1 + \ldots \big) \Big) \le \prod_{i=0}^{h-1} \big( 1 + R_i(q,s,k) \cdot N_i(q,s,k) \big) . \tag{3} $$



The task is now to estimate, by induction on $i$, the value $1 + R_i(q, s, k) \cdot N_i(q, s, k)$ from
above by $\exp^{(i+1)} \big[ (6 \cdot 2^i - 2)\,k(q + s) \big]$. Note that $k \ge q + s \ge 1$.

$$ 1 + N_0(q,s,k) \cdot R_0(q,s,k) = 1 + q \cdot (2^k + 1)^{s+1} \le 2^q \cdot 2^{(k+1)(s+1)} \le 2^{2k(s+1)+q} \le 2^{2k(s+1+q)} \le 2^{4k(s+q)} $$



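$$
\begin{aligned}
1 + N_{i+1}(q,s,k) \cdot R_{i+1}(q,s,k)
 &= 1 + q \cdot N_{i+1}(q,s,k)^{s+1} \\
 &= 1 + q \cdot \Big[ 2^k \cdot \big( R_i(q,s,k) + 1 \big)^{N_i(q,s,k)} \Big]^{s+1} \\
 &\le 1 + q \cdot \Big[ 2^k \cdot \big( 2 R_i(q,s,k) \big)^{N_i(q,s,k)} \Big]^{s+1} \\
 &= 1 + q \cdot \Big[ 2^k \cdot \big( 2q\, N_i(q,s,k)^s \big)^{N_i(q,s,k)} \Big]^{s+1} \\
 &\le 1 + q \cdot \Big[ \big( 2^{N_i(q,s,k)} \big)^{(k+q+s) \cdot N_i(q,s,k)} \Big]^{s+1} \\
 &= 1 + q \cdot \Big[ 2^{(k+q+s) \cdot N_i(q,s,k)^2} \Big]^{s+1} \\
 &\le \exp^{(1)} \big[ q + (k+q+s)(s+1) \cdot N_i(q,s,k)^2 \big] \\
 &\le \exp^{(1)} \big[ 2^{2(k+q+s)} \cdot N_i(q,s,k)^2 \big] \\
 &\le \exp^{(1)} \big[ 2^{2(k+q+s)} \cdot 2^{2 \exp^{(i)} ((6 \cdot 2^i - 2) k(q+s))} \big] \\
 &= \exp^{(2)} \big[ 2 \exp^{(i)} \big( (6 \cdot 2^i - 2) k(q+s) \big) + 2(k+q+s) \big] \\
 &\le \exp^{(2)} \big[ \exp^{(i)} \big( 2 \cdot (6 \cdot 2^i - 2) k(q+s) + (k+q+s) \big) \big] \\
 &\le \exp^{(2)} \big[ \exp^{(i)} \big( 2 \cdot (6 \cdot 2^i - 2) k(q+s) + 2k(q+s) \big) \big] \\
 &= \exp^{(2)} \big[ \exp^{(i)} \big( (6 \cdot 2^{i+1} - 2) k(q+s) \big) \big] \\
 &= \exp^{(i+2)} \big[ (6 \cdot 2^{i+1} - 2) k(q+s) \big]
\end{aligned}
$$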
With (3), we then get

$$
\begin{aligned}
|V(T_0)| \le |V(U_{h,q,s,t})|
 &\le \prod_{i=0}^{h-1} \big( 1 + R_i(q,s,k) \cdot N_i(q,s,k) \big) \\
 &\le \prod_{i=0}^{h-1} \exp^{(i+1)} \big[ (6 \cdot 2^i - 2) k(q+s) \big] \\
 &\le \exp^{(h)} \big[ 2 \cdot (6 \cdot 2^{h-1} - 2) k(q+s) \big] \\
 &\le \exp^{(h)} \big[ (2^{h+3} - 4) \cdot (t + 3q + s)(q + s) \big] \\
 &\le \exp^{(h)} \big[ (2^{h+5} - 12) \cdot (t + q + s)(q + s) \big] .
\end{aligned}
$$

► Corollary 3.4. Let $T$ be a rooted $t$-labelled tree of constant height $h \ge 1$, and let $\phi$ be an
MSO$_1$ sentence with $r$ quantifiers. Then $T \models \phi$ can be decided by an FPT algorithm in time

$$ O\big( \exp^{(h+1)} \big[ 2^{h+5} \cdot r(t + r) \big] + |V(T)| \big) = O\big( \exp^{(h+1)}(|\phi|^2) + |V(T)| \big) . $$

Proof. We (by brute force) exhaustively expand all the quantifiers of $\phi$ into all possible
valuations in the reduction $T_0$, having at most $2^{|V(T_0)|}$ possibilities for each. By searching
this "full valuation tree" in time $O\big(2^{|V(T_0)|}\big)^r$ we decide whether $T_0 \models \phi$. Using the size
bound on $T_0$ given by Theorem 3.3 where $r = q + s$, it is

$$
\begin{aligned}
2^{|V(T_0)| \cdot (q+s+1)}
 &\le 2^{\exp^{(h)} [ (2^{h+5} - 12) \cdot (t+q+s)(q+s) ] \cdot (q+s+1)} \\
 &\le \exp^{(h+1)} \big[ (2^{h+5} - 12) \cdot (t + q + s)(q + s) + (q + s + 1) \big] \\
 &\le \exp^{(h+1)} \big[ 2^{h+5} \cdot r(t + r) \big] .
\end{aligned}
$$

The arguments of Corollary 3.4 can be further extended to suitable classes of general
graphs via the traditional tool of interpretability of logic theories [19]. This powerful tool,
however, has a rather long formal description, and since we are going to use it only ad hoc in
some proofs anyway, we provide here only a brief conceptual sketch. Imagine two classes of
relational structures $\mathcal{K}, \mathcal{M}$ and two logical languages $\mathcal{L}_1, \mathcal{L}_2$. We say there is an
interpretation $I$ of the $\mathcal{L}_1$ theory of $\mathcal{K}$ into the $\mathcal{L}_2$ theory of $\mathcal{M}$ if (see Figure 3)

■ there exist $\mathcal{L}_2$ formulas which can "define" the domain and the relations of each structure
$H \in \mathcal{K}$ inside a suitable structure $G \in \mathcal{M}$, formally $H \simeq G^I$,
■ and each formula $\psi \in \mathcal{L}_1$ over $\mathcal{K}$ can be accordingly translated into $\psi^I \in \mathcal{L}_2$ over $\mathcal{M}$
such that "truth is preserved", i.e., $H \models \psi$ iff $G \models \psi^I$ for all such related $H, G$.

Figure 3 A basic informal scheme of an interpretation of $\mathrm{Th}_{\mathcal{L}_1}(\mathcal{K})$ into $\mathrm{Th}_{\mathcal{L}_2}(\mathcal{M})$.



Jakub Gajarsky and Petr Hlineny 



A simple example is an interpretation of the complement of a graph G into G itself
via defining the edge relation as $\neg\, edge(x,y)$. A slightly more complex example is the
interpretation of the line graph L(G) of a graph G inside G: the domain (vertex set) of L(G) is
interpreted in E(G), and the adjacency relation of L(G) is defined by the formula
$\alpha(e,f) \equiv e \ne f \wedge \exists x.\ inc(x,e) \wedge inc(x,f)$. This example interprets the MSO1 theory of line graphs in
the MSO2 theory of graphs.
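The two interpretations just described can be sketched in a few lines of Python. This is only an illustration under our own assumptions: graphs are given as plain vertex and edge lists, and the function names `complement` and `line_graph` are hypothetical helpers, not notation from the paper.

```python
from itertools import combinations

def complement(vertices, edges):
    """Interpret the complement of G in G: same domain, edge relation negated."""
    edge_set = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(vertices, 2)
            if frozenset(p) not in edge_set}

def line_graph(edges):
    """Interpret L(G) in G: the domain is E(G), and e, f are adjacent
    iff e != f and some vertex x is incident with both of them."""
    edge_list = [frozenset(e) for e in edges]
    adjacent = set()
    for e, f in combinations(edge_list, 2):
        if e != f and e & f:  # exists x. inc(x, e) and inc(x, f)
            adjacent.add(frozenset((e, f)))
    return edge_list, adjacent

# Example: the path a-b-c-d.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("c", "d")]
co_edges = complement(V, E)
lg_vertices, lg_edges = line_graph(E)
```

In this example the complement has the three edges ac, ad, bd, and the line graph is a path on the three edges of G.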

We now return to the promised extensions. Since the MSO2 theory of graphs of
tree-depth $\le d$ has an interpretation in coloured trees of depth $\le d+1$ (a graph G is actually
interpreted in W such that $G \subseteq cl(W)$, with labels determining which "back edges" of W
belong to G), we get the following generalization of Lampis' [16] from Corollary 3.4: MSO2
model checking can be done in FPT time which depends elementarily on the checked formula,
not only for graphs of bounded vertex cover, but also for those of bounded tree-depth.

► Theorem 3.5. Let $\mathscr{D}_d$ denote the class of all graphs of tree-depth $\le d$, and $\phi$ be an MSO2
sentence with r quantifiers. Then the problem of deciding $G \models \phi$ for $G \in \mathscr{D}_d$ has an FPT
algorithm with runtime $O\big(\exp^{(d+2)}(2^{3d+7}\cdot r^2) + |V(G)|\big)$.

We also remark on an important aspect of FPT algorithms using width parameters:
how to obtain the associated decomposition of the input (here of $G \in \mathscr{D}_d$). In the particular
case of tree-depth, the answer is rather easy since one can use the linear FPT algorithm for
tree-decomposition [2] to compute it (while, say, for clique-width this is an open problem).

Before proving Theorem 3.5 we state the aforementioned interpretation lemma for graphs
of tree-depth $\le d$.

► Lemma 3.6. Let d be an integer and $\mathscr{T}_d$ denote the class of all rooted d-labelled trees of
height d. For every MSO2 sentence $\phi$ and every d, there exists an efficiently computable
MSO1 sentence $\phi^I_d$ over $\mathscr{T}_{d+1}$ such that the following holds: For each $G \in \mathscr{D}_d$ there is
$T_G \in \mathscr{T}_{d+1}$ (which is obtained as a (d+1)-labelling of the forest W, $G \subseteq cl(W)$, certifying
$td(G) \le d$) such that $T_G^I \simeq G \models \phi \iff T_G \models \phi^I_d$.

If q, s, q', s' in this order denote the numbers of vertex, vertex-set, edge, edge-set quantifiers
in $\phi$, then $\phi^I_d$ has $q + s + (d+1)q' + ds'$ quantifiers and size $O(d|\phi|)$.

Proof. Let $G \in \mathscr{D}_d$ and W be a rooted forest of height d such that $G \subseteq cl(W)$, and let
$T = T_G \in \mathscr{T}_{d+1}$ be obtained from W by adding a new common root of special label $L_0$. We
are going to interpret G in a suitable (d+1)-labelling of T, identically mapping V(G) into
V(T). In particular, each vertex quantifier $\exists x \dots$ in $\phi$ is simply replaced with $\exists x.\ \neg L_0(x) \wedge \dots$
and nothing is changed with vertex-set quantifiers.

We partition the edges of G into $E(G) = E_1 \cup \dots \cup E_d$ such that $e = uv \in E_j$ iff the
ends u, v are at the levels i, i' in W and $|i - i'| = j$. For a node $x \in V(W)$, we assign x a
label $L_j$, $j \ge 1$, iff there is an ancestor y of x such that $xy \in E_j$. Each edge f is interpreted
in a (d+1)-tuple $u_0, u_1, \dots, u_d$ such that $f = u_0u_d$ where $u_0$ is an ancestor of $u_d$ in T,
the sequence $u_0 = \dots = u_{d-j}, u_{d-j+1}, \dots, u_d$ in this order forms the vertices of the unique
$u_0$-$u_d$ path in W, and $f \in E_j$. This property of $u_0, \dots, u_d$ can be routinely described by a
propositional formula

$\alpha_d \equiv \bigwedge_{i=1}^{d} \big(u_{i-1} = u_i \vee parent(u_{i-1}, u_i)\big) \wedge \bigwedge_{i=2}^{d} \big(u_{i-1} = u_i \to u_{i-2} = u_{i-1}\big) \wedge \bigvee_{i=1}^{d} \big(L_i(u_d) \wedge u_0 = u_{d-i} \ne u_{d-i+1}\big).$

Note $|\alpha_d| = O(d)$.

Hence each edge quantifier $\exists f \dots$ occurring in $\phi$ is replaced with $\exists u_0, \dots, u_d.\ \alpha_d(u_0, \dots, u_d) \wedge \dots$
This trivially gives also the vertex-edge incidence relation.

As for edge-sets $F \subseteq E(G)$, this F is interpreted in a d-tuple of node-sets $K_1, \dots, K_d \subseteq V(W)$
such that $y \in K_j$ iff y is the descendant end of some edge $xy \in F \cap E_j$. So $\exists F \dots$ in
$\phi$ is replaced with $\exists K_1, \dots, K_d \dots$. Finally, $f \in F$ is interpreted as
$\bigvee_{j=1}^{d} \big(u_d \in K_j \wedge u_0 = \dots = u_{d-j} \ne u_{d-j+1}\big)$.

To summarize, for every edge quantifier we create d + 1 new quantifiers in $\phi^I_d$, and for
every edge-set quantifier we create d new set quantifiers. Hence $\phi^I_d$ has $q + s + (d+1)q' + ds'$
quantifiers. It is also routine to verify that $G \models \phi \iff T_G \models \phi^I_d$, as desired.



Proof of Theorem 3.5. By Lemma 3.6 we need to perform model-checking on a (d+1)-labelled
tree $T_G$ of depth d + 1 and size |V(G)| + 1. The formula $\phi^I_d$ we need to evaluate has
$q + s + (d+1)q' + ds' \le (d+1)(q + s + q' + s') = (d+1)r$ quantifiers, where r is the number
of quantifiers in $\phi$. By substituting these values into Corollary 3.4 we get

$O\big(\exp^{(d+2)}(2^{d+6}\cdot(d+1)r(d+1)(r+1)) + |V(G)|\big) =$

$= O\big(\exp^{(d+2)}(2^{d+6}\cdot(d+1)^2\, r(r+1)) + |V(G)|\big) =$

$= O\big(\exp^{(d+2)}(2^{d+6}\cdot 2^{2d}\cdot 2r^2) + |V(G)|\big),$

the desired result.
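The arithmetic behind this substitution can be checked numerically. The sketch below is our own illustration (the function names are hypothetical): it compares the argument of the exponential tower obtained from Corollary 3.4 with h = t = d + 1 and (d + 1)r quantifiers against the simplified argument stated in Theorem 3.5.

```python
def translated_quantifiers(d, q, s, q_edge, s_edge):
    """Quantifier count of the translated sentence from Lemma 3.6."""
    return q + s + (d + 1) * q_edge + d * s_edge

def exponent_before(d, r):
    """Argument of exp^(d+2) after substituting h = t = d + 1 and
    r' = (d + 1) * r into 2^(h+5) * r' * (t + r')."""
    return 2 ** (d + 6) * (d + 1) ** 2 * r * (r + 1)

def exponent_after(d, r):
    """Simplified argument 2^(3d+7) * r^2 stated in Theorem 3.5."""
    return 2 ** (3 * d + 7) * r ** 2

# The simplification uses (d+1)^2 <= 2^(2d) and r(r+1) <= 2r^2 for d, r >= 1.
for d in range(1, 8):
    for q, s, qe, se in [(1, 0, 0, 0), (2, 1, 3, 1), (0, 0, 4, 4)]:
        r = q + s + qe + se
        assert translated_quantifiers(d, q, s, qe, se) <= (d + 1) * r
        assert exponent_before(d, r) <= exponent_after(d, r)
```

For d = 1 and r = 1 both arguments equal 1024, so the simplification is tight in that case.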

Concerning MSO1 model checking, one can go further. Graphs of neighbourhood diversity
m (introduced in [16]) are precisely those having a model in which every vertex receives one
of m colours, and the existence of an edge between u, v depends solely on the colours of u, v.
Clearly, these graphs coincide with those having a tree model of m colours and depth 1, and
so we can give an FPT algorithm for MSO1 model checking on them from Corollary 3.4,
which is an alternative derivation for another result of Lampis [16]. We can similarly derive
an estimation of the main result of fJJ] (here just one exponential fold worse). 

A common generalization of these particular applications of Corollary 3.4 has been found,
together with the new notion of shrub-depth, in this subsequent work:

► Theorem 3.7 (Ganian et al. [13]). Assume $d \ge 1$ is a fixed integer. Let $\mathscr{S}$ be any graph
class of shrub-depth d (Definition 2.3). Then the problem of deciding $G \models \phi$ for the input
$G \in \mathscr{S}$ and MSO1 sentence $\phi$ can be solved by an FPT algorithm, the runtime of which
has an elementary dependence on the parameter $\phi$. This assumes G is given on the input
alongside with its tree model of depth d.

4 Expressive power of FO and MSO 



Theorem 3.3 has another interesting corollary in the logic domain. Since the size of the
reduction $T_0$ of T is bounded independently of T, the outcome of $T \models \phi$ actually depends
on a finite number of fixed-size cases, and one can use even FO logic to express (one would
say by brute force) which of these cases is the correct (q, s, t + 3q + s)-reduction of T. The
outlined arguments lead to the following conclusions.



► Theorem 4.1 (Theorem 3.3). Let $t, h \ge 1$ be integers, and let $\phi$ be an MSO1 sentence with
q element quantifiers and s set quantifiers. There exists a finite set of rooted t-labelled trees
$\mathscr{U}_{h,t,\phi}$ satisfying the following: For any rooted t-labelled tree T of height $\le h$, it holds $T \models \phi$
if and only if the (q, s, t + 3q + s)-reduction of T is l-isomorphic to a member of $\mathscr{U}_{h,t,\phi}$.

Proof. The set $\mathscr{U}_{h,t,\phi}$ is simply formed by all (q, s, t + 3q + s)-reductions of the t-labelled
trees T of height $\le h$ such that $T \models \phi$, modulo isomorphism. Since t, h are fixed and the
members of $\mathscr{U}_{h,t,\phi}$ are thus of bounded size, we have finitely many nonisomorphic possibilities
for those.
With Theorem 4.1 we get quite close to the very recent achievement of Elberfeld, Grohe,
and Tantau [9], who prove that FO and MSO2 have equal expressive power on the graphs of
bounded tree-depth (and that this condition is also necessary on hereditary graph classes).
The following weaker statement is actually an easy consequence of our findings, too:

► Corollary 4.2 (Elberfeld, Grohe, and Tantau [9]). Let h, t be integers, and $\phi$ an MSO1
sentence. Then there is an FO sentence $\psi_{h,t,\phi}$ such that, for any rooted t-labelled tree T of
height $\le h$, it is $T \models \phi \iff T \models \psi_{h,t,\phi}$.



Proof. Let $\mathscr{U}_{h,t,\phi}$ be the finite set given by Theorem 4.1 and let red(T) shortly denote the
(q, s, k)-reduction of T where k = t + 3q + s. We write

$\psi_{h,t,\phi} \equiv \bigvee_{W \in \mathscr{U}_{h,t,\phi}} \exists x.\ root(x) \wedge \tau_W(x).$

The intended meaning of the FO formula $\tau_W$ is that $T \models \tau_W(r)$, where $r \in V(T)$, if and only if the
rooted subtree $T_r \subseteq T$ induced on r and all its descendants fulfills $red(T_r) \simeq W$. Assuming
existence of such $\tau_W$ for a moment, we see that $T \models \phi \iff T \models \psi_{h,t,\phi}$.

We build $\tau_W$ recursively by induction on the height of W. For height zero, i.e. when W is
a single vertex, $\tau_W(x)$ simply tests the correct label of x and that x has no children. Now let
W be of height h > 0, with the root w and its limbs $W_{i,j}$ where $i = 1, \dots, a$ and $j = 1, \dots, b_i$,
such that all $W_{i,j}$ for $j = 1, \dots, b_i$ are l-isomorphic to $U_i = W_{i,1}$, and these $U_i$ for $i = 1, \dots, a$
are pairwise nonisomorphic. Let M denote the set of those $U_i$ for which $b_i = R_{h-1}(q, s, k)$
(note that $b_i$ cannot be larger than that by the definition of (q, s, k)-reduction).

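We can now write

$\tau_W(x) \equiv \exists (y_{i,j} : i = 1, \dots, a,\ j = 1, \dots, b_i).\ \bigwedge_{i,j} parent(x, y_{i,j}) \wedge$

$\wedge \bigwedge_{(i,j) \ne (i',j')} y_{i,j} \ne y_{i',j'} \wedge \bigwedge_{i,j} \tau_{U_i}(y_{i,j}) \wedge$

$\wedge \Big(\forall z.\ parent(x, z) \to \big(\bigvee_{i,j} z = y_{i,j} \vee \bigvee_{U_i \in M} \tau_{U_i}(z)\big)\Big),$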

meaning that: (1) among the limbs of x in T there exist pairwise distinct ones in a one-to-one
l-isomorphism correspondence to the limbs of w in W, and (2) all the other limbs of x in T
are l-isomorphic to some $U_i$ which has the maximum allowed repetition $R_{h-1}(q, s, k)$ (and
hence Lemma 3.1 a) applies to them). By a routine check of the induction step this means
$red(T_x) \simeq W$ iff $T \models \tau_W(x)$.

The recursive construction of $\tau_W$ is finished, and so is the proof.
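The idea of a reduction that truncates repeated isomorphic limbs can be illustrated as follows. This is a rough sketch under our own assumptions, not the precise (q, s, k)-reduction of Section 3.1: trees are nested pairs `(label, children)`, the reduction is applied bottom-up, and the threshold function `threshold(height)` is a hypothetical stand-in for the values $R_j(q, s, k)$.

```python
def canon(tree):
    """Canonical form of a rooted labelled tree; equal forms mean the
    trees are isomorphic as labelled rooted trees."""
    label, children = tree
    return (label, tuple(sorted(canon(c) for c in children)))

def height(tree):
    _, children = tree
    return 0 if not children else 1 + max(height(c) for c in children)

def reduce_tree(tree, threshold):
    """Recursively reduce the limbs, then keep at most threshold(j)
    pairwise isomorphic limbs of height j below every node."""
    label, children = tree
    reduced = [reduce_tree(c, threshold) for c in children]
    kept, seen = [], {}
    for child in sorted(reduced, key=canon):
        key = canon(child)
        seen[key] = seen.get(key, 0) + 1
        if seen[key] <= threshold(height(child)):
            kept.append(child)
    return (label, tuple(kept))

# A star with five leaves labelled "a" and one leaf labelled "b",
# reduced with the (hypothetical) constant threshold 2.
star = ("r", [("a", [])] * 5 + [("b", [])])
small = reduce_tree(star, lambda j: 2)
```

Since the output size is bounded in terms of the thresholds, the label count and the height alone, only finitely many reduced trees can occur, which is what makes the finite disjunction in the proof above possible.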

It is now a natural question whether and how our alternative approach to coincidence
between FO and MSO on graphs could be extended in the same direction.

Indeed, given an MSO2 sentence $\phi$ over $\mathscr{D}_d$ (the graphs of tree-depth $\le d$), we can
interpret this in an MSO1 sentence $\phi^I_d$ over rooted (d+1)-labelled trees of height $\le d+1$ (see
Lemma 3.6). Then, by Corollary 4.2 we immediately get an FO sentence $\sigma_d$ equivalent
to $\phi^I_d$. The problem is, however, that $\sigma_d$ is a formula over rooted (d+1)-labelled trees,
and we would like to get an interpretation of $\sigma_d$ back in the FO theory of the class $\mathscr{D}_d$,
which does not seem to be an easy task directly. Still, part of the arguments of [3] can be
combined with the approach of Corollary 4.2 to provide an alternative relatively short proof
of coincidence between FO and MSO1 on classes of bounded tree-depth (thus bypassing the
Feferman-Vaught-type theorem in [9]).

The reason for specifically mentioning Elberfeld, Grohe, and Tantau's [9] here is actually
their main posed question: what are the sufficient and necessary conditions for a hereditary
graph class to guarantee the same expressive power of FO and MSO1? Using Theorem 4.1
and improved ideas based on a proof of Corollary 4.2 we provide a nontrivial sufficient
condition which we also conjecture to be necessary.

► Theorem 4.3. Let d be an integer and $\mathscr{S}$ be any graph class of shrub-depth d (Definition 2.3).
Then for every MSO1 sentence $\phi$ there is an FO sentence $\psi_{d,\phi}$ such that, for any
$G \in \mathscr{S}$, it is $G \models \phi \iff G \models \psi_{d,\phi}$. Consequently, FO and MSO1 have the same expressive
power on $\mathscr{S}$.

► Conjecture 4.4. Consider a hereditary (i.e., closed under induced subgraphs) graph class $\mathscr{S}$.
If the expressive powers of FO and MSO1 are equal on $\mathscr{S}$, then the shrub-depth of $\mathscr{S}$ is
bounded (by a suitable constant).

The key to proving Theorem 4.3 is the notion of twin sets. Recall that two vertices
$x, y \in V(G)$ are called twins if their neighbour sets in G − x − y coincide. Though the edge
xy is not specified in this definition, it easily follows that whenever we have a set of pairwise
twins in G, then those induce a clique or an independent set.

► Definition 4.5 (Twin sets). Assume $X = \{x_1, \dots, x_k\}$ and $Y = \{y_1, \dots, y_k\}$ are disjoint
indexed sets (k-tuples) of vertices of a graph G. We say that X, Y are twin sets in G if

the subgraphs of G induced on X and on Y are identical, i.e., $x_ix_j \in E(G)$ iff $y_iy_j \in E(G)$
for all pairs $i, j \in \{1, \dots, k\}$, and

for $i = 1, \dots, k$, the set of neighbours of $x_i$ in $V(G) \setminus (X \cup Y)$ equals that of $y_i$.

Note that, for simplicity, we consider the twin-sets relation only on disjoint sets, and that 
this relation is generally not transitive. Although we do not need more for this paper, we 
suggest that the notion deserves further extended study elsewhere. 
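Definition 4.5 translates directly into a small test. The following sketch is only an illustration with our own hypothetical helper names (`are_twin_sets`, `make_adj`); graphs are stored as adjacency dictionaries.

```python
def make_adj(vertices, edges):
    """Build an adjacency dictionary of an undirected graph."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def are_twin_sets(adj, X, Y):
    """Check Definition 4.5 for indexed tuples X, Y of vertices."""
    k = len(X)
    if len(Y) != k or set(X) & set(Y):
        return False  # twin sets are disjoint k-tuples
    # (1) identical induced subgraphs with respect to the indexing
    for i in range(k):
        for j in range(k):
            if (X[j] in adj[X[i]]) != (Y[j] in adj[Y[i]]):
                return False
    # (2) the same neighbours outside X and Y
    outside = set(adj) - set(X) - set(Y)
    return all(adj[X[i]] & outside == adj[Y[i]] & outside for i in range(k))

# Edges between X and Y (here a-d) are not constrained by the definition.
adj = make_adj("abcdz", [("a", "b"), ("c", "d"), ("a", "z"), ("c", "z"), ("a", "d")])
twins = are_twin_sets(adj, ("a", "b"), ("c", "d"))
not_twins = are_twin_sets(adj, ("a", "b"), ("d", "c"))
```

Note that the indexing matters: the same two vertex sets are twin sets under one indexing and fail the test under the other.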



A tree model (Definition 2.2) of a graph G can be, informally, viewed as a complete
recursive decomposition (of bounded depth) of G into groups of pairwise disjoint pairwise
twin sets. Roughly, an application of Lemma 3.1 a) then says that if (at any level) the
number of pairwise twin sets in a group is "too large", then one of these sets can be deleted
from G without affecting validity of a fixed MSO1 property on G. Our main task is to
describe "reducibility" of a large group of twin sets in G using FO (the sets having bounded
size, though), which is more complicated than in the tree-depth case due to lack of some
"nice connectivity properties" of a tree-depth decomposition.

Proof outline (Theorem 4.3). We assume a graph $G \in \mathscr{S}$ with a tree model T of constant
depth d, and an MSO1 sentence $\phi$. We informally continue as follows.

(I) For every fixed d, one can easily interpret $\phi$ in an MSO1 formula $\phi^I_d$ over T, such that
$G \models \phi \iff T \models \phi^I_d$.

(II) By Definition 2.2, pairwise l-isomorphic sibling limbs in T correspond to a group of
pairwise twin sets in G. Deleting one of these sets from G is equivalent to deleting
the corresponding limb from T. Hence by (I) and Theorem 4.1 there is a finite set
$\mathscr{U}$ of graphs (independent of G) such that $G \models \phi$ iff G "reduces" to a member of $\mathscr{U}$.

(III) The meaning of "reduction" is analogous to Section 3.1: to a (q, s, k)-reduced subtree
of the tree model T. The minor technical differences are: (1) we can describe
the reduction using twin sets, without an explicit reference to whole T, and (2) we
actually aim at a (q, s, k)'-reduction, which means the reduction threshold values are
$R'_j(q, s, k) = \max\{R_j(q, s, k), 2\}$. (We need to guarantee that at least two twin sets
of each group remain after the reduction, even in degenerate cases.)

(IV) We provide an FO definition of the fact that G reduces to $H \in \mathscr{U}$, modulo some technical
details. This FO formula $\varrho_H$ depends mainly on d and H (actually on a suitable
tree model of H). The desired sentence $\psi_{d,\phi}$ in Theorem 4.3 is then constructed as
the (finite) disjunction $\psi_{d,\phi} \equiv \bigvee_{H \in \mathscr{U}} \varrho_H$.



Before providing a full proof of Theorem 4.3 we include some needed technical claims
and details. Notice that instead of m leaf colours in a tree model we consider m-labelled
rooted trees (where only single labels occur and only in the leaves).

► Lemma 4.6. Assume a graph G with a tree model T of m labels and constant depth
d. For any MSO1 sentence $\phi$ (over G) there is an m-labelled MSO1 formula $\phi^I_d$ (over T),
independent of G, such that $G \models \phi \iff T \models \phi^I_d$.



Proof. We interpret $\phi$ into $\phi^I_d$ over T by directly following Definition 2.2. Since V(G) is the
set of leaves of T, we replace every vertex quantifier $\forall x \dots$ occurring in $\phi$ with $\forall x.\ leaf(x) \to \dots$.
The core is to interpret the predicates edge(x, y):

Let $\mathscr{C} = \{1, \dots, m\}$ be the set of colours used in T. Let $\mathscr{L}_c$ be the set of those ordered
pairs $(i, j) \in \mathscr{C}^2$ such that the tree model T defines edges between vertices of colours i and
j at distance 2c. We interpret edge(x, y) as

$\bigvee_{c=1,\dots,d} \Big(dist_{2c}(x, y) \wedge \bigvee_{(i,j) \in \mathscr{L}_c} \big(L_i(x) \wedge L_j(y)\big)\Big)$

where $dist_n$ routinely expresses that the distance between two vertices is n, by guessing and
verifying the n − 1 intermediate vertices.
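The edge interpretation above can be mirrored by a direct computation of the graph that a tree model defines. The sketch below is an illustration under stated assumptions: all leaves lie at the same depth (so the distance of two leaves is twice the height of their lowest common ancestor), and the names `graph_from_tree_model` and `signature` are ours. A triple (i, j, c) in `signature` plays the role of a pair (i, j) in $\mathscr{L}_c$.

```python
from itertools import combinations

def graph_from_tree_model(parent, colour, signature):
    """Return the edge set of the graph defined by a tree model.

    parent    -- maps every tree node to its parent (the root maps to None)
    colour    -- maps every leaf (graph vertex) to its colour
    signature -- set of triples (i, j, c): leaves of colours i and j
                 at tree distance 2c are adjacent
    """
    def chain(v):
        # v, parent(v), ..., root
        path = [v]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

    edges = set()
    for u, v in combinations(sorted(colour), 2):
        # first index at which the two root-ward chains meet
        c = next(k for k, (a, b) in enumerate(zip(chain(u), chain(v))) if a == b)
        i, j = colour[u], colour[v]
        if (i, j, c) in signature or (j, i, c) in signature:
            edges.add(frozenset((u, v)))
    return edges

# A depth-2 tree model with one colour: leaves are adjacent iff they
# are at distance 4, which gives the complete bipartite graph K_{2,2}.
parent = {"root": None, "a": "root", "b": "root",
          "u1": "a", "u2": "a", "v1": "b", "v2": "b"}
colour = {"u1": 1, "u2": 1, "v1": 1, "v2": 1}
edges = graph_from_tree_model(parent, colour, {(1, 1, 2)})
```

Changing the signature to {(1, 1, 1)} instead joins only the sibling leaves, giving two disjoint edges.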



Now we give the crucial technical detail and the related claims which make step (IV)
work. Assume T is a tree model of a graph G, and B is a limb of a node v in T, such that
W is the set of leaves of B. We say that a tree model T' is obtained from T by splitting B
along $X \subseteq W$ if a disjoint copy B' of B with the same parent v is added into T, and then B
is restricted to a rooted Steiner tree of $W \setminus X$ while B' is restricted to a rooted Steiner tree
of X' (the corresponding copy of X). A tree model T is splittable if some limb in T can be
split along some subset X of its leaves, making a tree model T' which represents the same
graph G as T does. A tree model is unsplittable if it is not splittable. Notice that any tree
model can be turned into an unsplittable one, simply since the splitting process must end
eventually.

► Lemma 4.7. Let H be a graph, and $R \subseteq H$ be an induced subgraph having a tree model T
(of m colours and depth d, but this is not relevant). Let T contain two disjoint l-isomorphic
limbs B, B' of a node v, and a limb C of a node u. The position of C against B, B' can be
arbitrary (it may be u = v or even C = B or C = B'), as long as C is disjoint from one
of B, B'. Let $W, W' \subseteq V(R)$ denote the sets of leaves of B, B', respectively, and $X \subseteq V(R)$
denote the set of leaves of C. Assume $Y, Z \subseteq V(H) \setminus V(R)$ are such that W, W', Y are
pairwise twin sets in H, and that X, Z are also twin sets in H. If $Y \ne Y \cap Z \ne \emptyset$, then the
tree model T of R is splittable.

Proof. Let $Y = \{y_1, \dots, y_b\}$ and $Z = \{z_1, \dots, z_c\}$ such that $y_i = z_i$ for $1 \le i \le a$ where
$a \le b, c$. Let $W = \{w_1, \dots, w_b\}$, $W' = \{w'_1, \dots, w'_b\}$, and $X = \{x_1, \dots, x_c\}$, indexed in
accordance with the assumed twin relations between W, W', Y and between X, Z.

By the assumptions, say, $X \cap W' = \emptyset$. We choose any $1 \le i \le a$ and $a < j \le b$, and prove
that $w_iw_j \in E(H)$ iff $w_iw'_j \in E(H)$: Indeed, say, $w_iw_j \in E(H)$ implies that $y_iy_j \in E(H)$
by the twin correspondence of W, Y. Then, since X, Z are twin sets (and $y_i = z_i \in Z$
while $y_j \notin Z$), we have $x_iy_j \in E(H)$. By the twin correspondence of W' and Y, it is then
$x_iw'_j \in E(H)$. Using the twin correspondence of X, Z again, $z_iw'_j \in E(H)$. Now, $y_i = z_i$
and Y, W are twin sets, and so $w_iw'_j \in E(H)$ as desired. In the same way, $w_iw_j \notin E(H)$
implies $w_iw'_j \notin E(H)$.

Now we have got enough to claim that splitting of B in T along the set $W_1 = \{w_1, \dots, w_a\}$
creates a tree model $T_1$ of the same graph R. By Definition 2.2 this splitting operation
affects only the edges from $W_1$ to $W \setminus W_1$; and those (in this particular case) will be, in the
graph modelled by $T_1$, exactly corresponding to the edges from $W_1$ to $W' \setminus \{w'_1, \dots, w'_a\}$ in
R, which have been addressed in the previous paragraph.

Figure 4 A situation which cannot happen, in a graph G with an unsplittable tree model T of
an induced subgraph $R \subseteq G$, and with the sets W, W', Y and X, Z as in Lemma 4.7.



Analogously to Section 3.1 we say that a labelled rooted tree T of height i is (q, s, k)'-reduced
if, at any level j, $0 < j \le i$, each node of T has at most $R'_j(q, s, k) =
\max\{R_j(q, s, k), 2\}$ pairwise l-isomorphic limbs (i.e., we just ensure the reduction threshold
is always $\ge 2$, see Lemma 4..



► Lemma 4.8. Let $m, d \ge 1$ and q, s be integers. Assume $G \in TM_m(d)$ is a graph, and
$R \subseteq G$ is an induced subgraph having an unsplittable tree model T (of m colours and depth
d). Let $\bar{x}_R = (x_v : v \in V(R))$ be a vector of free variables valued in the respective vertices of
R in G. Then there exists an FO formula $\varrho_T$, depending on d, m, q, s, and T, such that the
following holds: $G \models \varrho_T(\bar{x}_R)$ if, and only if, $R \subseteq G$ and there exists a tree model $T' \supseteq T$ of
G of m colours and depth d, such that the (q, s, m + 3q + s)'-reduction of T' is T.



The importance of Lemma 4.7 in the proof of Lemma 4.8 is, informally, in that one can
focus just on including every vertex of G − R into some set which is twin (possibly after
recursive reduction) to suitable limbs of T, while such sets will then never overlap. See
Figure 4. With Lemma 4.8 at hand, it is then straightforward (though technical and not
short) to finish the proof of Theorem 4.3 along the aforementioned outline.



Proof. Let $K = \{1, 2, \dots, k\}$ be an index set. We start with defining FO predicates expressing
that the k-tuples of vertices (assumed disjoint) which are determined by the valuation
of $\bar{x}_K = (x_i : i \in K)$ and of $\bar{y}_K = (y_i : i \in K)$ induce identical subgraphs (cf. $\nu$) and have
the same adjacencies to external z (cf. $\omega$).

$\nu(\bar{x}_K, \bar{y}_K) \equiv \bigwedge_{i,j \in K} \big(edge(x_i, x_j) \leftrightarrow edge(y_i, y_j)\big)$

$\omega(\bar{x}_K, \bar{y}_K, z) \equiv \bigvee_{i \in K} z = x_i \vee \bigvee_{i \in K} z = y_i \vee \bigwedge_{i \in K} \big(edge(x_i, z) \leftrightarrow edge(y_i, z)\big)$

By Definition 4.5, $\nu(\bar{x}_K, \bar{y}_K) \wedge \forall z.\ \omega(\bar{x}_K, \bar{y}_K, z)$ means these k-tuples are twin sets.



We then give the following:

$\varrho_T(\bar{x}_R) \equiv \bigwedge_{ij \in E(R)} edge(x_i, x_j) \wedge \bigwedge_{ij \notin E(R)} \neg\, edge(x_i, x_j) \wedge \forall r.\ \tau(\bar{x}_R, r, \bar{x}_R)$   (4)

The meaning is to verify that the valuation of $\bar{x}_R$ really induces the desired fixed subgraph
$R \subseteq G$, and that every other vertex r of G "reduces" (in the sense of the tree model T
associated implicitly with R) to it; see below.

We adopt a convention that the indices of $\bar{x}_R$ implicitly encode the tree model T, and if
B is a limb of T (or B = T), then we shortly denote by $\bar{x}_B$ the subvector of variables corresponding
to the leaves L(B) of B. We are going to give a recursive definition of $\tau(\bar{x}_B, r, \bar{t}_K)$
for (4), roughly meaning (cf. also Corollary 4.2) that the vertex r belongs to some implicit
(and possibly new) limb which reduces into the tree model B. The role of $\bar{t}_K$ is technical: if
a new limb is used to reduce r, then it has to avoid the vertices in $\bar{t}_K$ (to be "new"). To
formulate the recursion more precisely, we write $\tau_B(\bar{x}_B, r, \bar{t}_K)$ in the place of $\tau$ to indicate
that $\tau$ depends on the structure of the tree model B.

On the base level, if B = {v}, we write

$\tau_B(\bar{x}_B, r, \bar{t}_K) \equiv (r = x_v)$,   (5)

i.e., the only possible base reduction is direct equality to the selected vertex. Otherwise, let
B be of height h > 0, with the root w and its limbs $B_{i,j}$ where $i = 1, \dots, a$ and $j = 1, \dots, b_i$,
such that all $B_{i,j}$ for $j = 1, \dots, b_i$ are l-isomorphic to $B_{i,1}$, and these $B_{i,1}$ for $i = 1, \dots, a$
are pairwise nonisomorphic. Let $\mathscr{B}$ denote the set of those i for which $b_i = R'_{h-1}(q, s, k) \ge 2$
(our special reduction threshold, k = m + 3q + s). Then



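$\tau_B(\bar{x}_B, r, \bar{t}_K) \equiv \bigvee_{i=1,\dots,a;\ j=1,\dots,b_i} \tau_{B_{i,j}}(\bar{x}_{B_{i,j}}, r, \bar{t}_K) \vee \bigvee_{i \in \mathscr{B}} \big(\tau'_{i,B}(\bar{x}_B, \bar{t}_K) \wedge \tau''_{i,B}(\bar{x}_B, r, \bar{t}_K)\big)$,   (6)

where the details follow. We actually need a slightly relaxed(!) notion of twin sets,

$\zeta(\bar{y}_C, \bar{y}_D, \bar{t}_K) \equiv \nu(\bar{y}_C, \bar{y}_D) \wedge \forall z.\ \big(\tau_C(\bar{y}_C, z, \bar{t}_K) \vee \tau_D(\bar{y}_D, z, \bar{t}_K) \vee \omega(\bar{y}_C, \bar{y}_D, z)\big)$

meaning that $\bar{y}_C$ and $\bar{y}_D$ determine twin sets in the graph G minus those vertices which are
recursively reducible into them, and this is used for (6) as follows:

$\tau'_{i,B}(\bar{x}_B, \bar{t}_K) \equiv \bigwedge_{j \ne k \in \{1,\dots,b_i\}} \zeta(\bar{x}_{B_{i,j}}, \bar{x}_{B_{i,k}}, \bar{t}_K)$   (7)

With a shortcut $C = B_{i,1}$ (and so $\bar{y}_C$ presents a vector of fresh variables with a structure
corresponding to that of $\bar{x}_{B_{i,1}}$ and of $\bar{x}_{B_{i,2}}$), we continue:

$\tau''_{i,B}(\bar{x}_B, r, \bar{t}_K) \equiv \exists \bar{y}_C.\ \nu(\bar{x}_{B_{i,1}}, \bar{y}_C) \wedge \Big(\bigvee_{\ell \in L(C)} y_\ell = r\Big) \wedge \bigwedge_{\ell \in L(C),\ k \in K} y_\ell \ne t_k \wedge$   (8)

$\wedge\ \zeta(\bar{x}_{B_{i,1}}, \bar{y}_C, \bar{t}_K \cup \bar{y}_C) \wedge \zeta(\bar{x}_{B_{i,2}}, \bar{y}_C, \bar{t}_K \cup \bar{y}_C)$   (9)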
We give an informal explanation of the meaning of $\tau_B$. In (6), we see the main recursive
part saying that either r reduces into some of the lower $B_{i,j}$ limbs, or that $\tau''$ happens for r.
The formula $\tau'_{i,B}(\bar{x}_B, \bar{t}_K)$ verifies that l-isomorphic limbs of B really define twin sets (forgetting
the vertices which are recursively reducible into them). Then $\tau''_{i,B}(\bar{x}_B, r, \bar{t}_K)$ claims that
there exists an induced subgraph Y of G determined by the variables $\bar{y}_C$, such that Y,
containing r, is an identical copy of the subgraph induced by $B_{i,1}$, and Y "avoids" the base
tree model T. Furthermore, $\bar{x}_{B_{i,1}}$, $\bar{x}_{B_{i,2}}$ and $\bar{y}_C$ determine pairs of twin sets in the
graph G minus those vertices which are recursively reducible into them.

It remains to prove that $\varrho_T(\bar{x}_R)$ has the desired meaning on G. Assume R has an
unsplittable tree model T (of m colours and depth d), and $G \supseteq R$ has a tree model $T' \supseteq T$
such that the (q, s, m + 3q + s)'-reduction of T' is T. To verify $G \models \varrho_T(\bar{x}_R)$, by (4) we
only need to address validity of $\forall r.\ \tau(\bar{x}_R, r, \bar{x}_R)$. If this $r \notin V(R)$, then since T is a
(q, s, m + 3q + s)'-reduction of T', the following holds: There is a limb $M_1$ of T', containing
r, such that the (q, s, m + 3q + s)'-reduction of $M_1$ is $M_0$, and $M_0$ has an l-isomorphic sibling
limb M in T. Moreover, $M_1$ and M are disjoint in T', and we may assume $r \in V(M_0)$.

In the setting of (6), this means (up to symmetry) that $M = B_{i,1}$ for some $i \in \mathscr{B}$, while
B is the unique parent limb of M in T. Hence we may proceed to showing validity of
$\tau''_{i,B}(\bar{x}_B, r, \bar{t}_K)$: $\exists \bar{y}_C$ is valued in the respective vertices-leaves of $M_0$, and the rest follows easily
from the definition of a tree model. This concludes the easier forward direction of the proof.

Conversely, assume $G \models \varrho_T(\bar{x}_R)$. Then, by (4), the subgraph of G induced by $\bar{x}_R$ is
identical to the graph determined by the tree model T. Our task is to build the tree model
$T' \supseteq T$ of whole G, which we do by structural induction on T (wrt. (6)) as follows. The
base case of T being a singleton vertex v is trivial (since there is nothing else to reduce than
$r = x_v$). Further on, we use the notation B and $B_{i,j}$ as above, and denote by $P_{i,j}$ the set
of all r in G such that $\tau_{B_{i,j}}(\bar{x}_{B_{i,j}}, r, \bar{t}_K)$ of (6) holds true. By an inductive assumption, there
exists a tree model $B'_{i,j} \supseteq B_{i,j}$ (of m colours and depth d) of the subgraph $G_{i,j} \subseteq G$ induced
on $P_{i,j}$, such that the (q, s, m + 3q + s)'-reduction of $B'_{i,j}$ is $B_{i,j}$.

Let $P^\circ = \bigcup_{i,j} P_{i,j}$ and $B^\circ = B \cup \bigcup_{i,j} B'_{i,j}$. We claim that $B^\circ$ is a tree model of $G^\circ$, the
subgraph induced on $P^\circ$. By Lemma 4.7 we know that possible overlaps between distinct
$B'_{i,j}$ and $B'_{i',j'}$ can only happen in whole identical limbs of them, and so the affected limbs
can be deleted from one of them. Hence the definition of $B^\circ$ is sound, and the claim follows
from the definition of a tree model (as applied to B).

Let, furthermore, P' denote the set of all r in G such that $\tau_B(\bar{x}_B, r, \bar{t}_K)$ holds true, and
$Q = P' \setminus P^\circ \ne \emptyset$. By (8), whole Q is covered by sets $Y_1, Y_2, \dots, Y_q$ corresponding to the
satisfying choices of $\bar{y}_C$ there. We claim two facts that routinely follow from Lemma 4.7
and the assumption of unsplittability of T: $Y_j \subseteq Q$ for all these sets, and $Y_j \cap Y_k = \emptyset$ for all
distinct pairs of them. Indeed, say, for the latter one we take the subgraph H induced by
$V(R) \cup Y_j \cup Y_k$, and note that each of $Y_j$, $Y_k$ is a twin set to two other sets of R determined by
l-isomorphic limbs in T. Hence if $Y_j \ne Y_j \cap Y_k \ne \emptyset$, then we would get by Lemma 4.7 that
T was splittable.

Altogether, for each $Y_j$ we can thus separately make a sibling copy of the corresponding
limb of B (to which $Y_j$ reduces), and these together give a tree model $B' \supseteq B^\circ$. It is again
routine to verify that B' models G', the subgraph induced on P'. This finishes our induction
step, and thus the whole lemma when B = T and G' = G.

Proof of Theorem 4.3. We start with an observation that any tree model T of a graph G
can be turned into an unsplittable one: Indeed, let $s_i$ be the sum of degrees of the nodes
at distance i from the root, $0 \le i \le d$. Then every splitting operation lexicographically
increases the vector $(s_0, s_1, \dots, s_d)$, and so the process must eventually end in an unsplittable
tree model of G.

By Definition 2.3 it is $\mathscr{S} \subseteq TM_m(d)$. Let $\sigma \equiv \phi^I_d$ be the formula from Lemma 4.6 and q,
s be the numbers of vertex and set quantifiers of $\sigma$, and k = m + 3q + s. We simply denote
by $\mathscr{U}_m$ the set of all (q, s, k)'-reduced unsplittable tree models T of m colours and depth d
such that $T \models \sigma$. Then, using Lemma 3.1 we have that $G \models \phi$ if, and only if, G has a tree
model whose (q, s, k)'-reduction is l-isomorphic to a member of $\mathscr{U}_m$. Using Lemma 4.8
we give $\psi_{d,\phi}$ as follows:

$\psi_{d,\phi} \equiv \bigvee_{T \in \mathscr{U}_m} \exists \bar{x}_T.\ \varrho_T(\bar{x}_T)$



5 Conclusions 



We briefly recapitulate the two-fold contribution of our primary result: that the MSO model
checking problem on the universe of coloured trees of bounded height can be reduced to a
kernel of size bounded by an elementary function of the formula. Firstly, it allows us to
easily obtain nontrivial extensions of Lampis' and Ganian's results and to fill the gap set by
Courcelle's theorem and the negative result of Frick and Grohe.

Secondly, it provides an alternative, simple and intuitive way of understanding why FO and
MSO logics coincide on some classes of graphs. In this respect, the most important
property of our kernel is that, after seeing more than a certain number of copies of a certain
substructure in the input graph, the validity of an MSO formula in question does not change
any further. While such behaviour is natural for FO properties, it is somewhat surprising to
see it for the much wider MSO. This "loss of expressiveness" of MSO (getting down to the FO
level) is inherited by graph classes of bounded tree-depth and shrub-depth.



Finally, we briefly discuss why we believe Conjecture 4.4 holds true. It is known [3] that
each subgraph-closed class of graphs such that FO = MSO2 has to have bounded tree-depth.
Both classes of bounded tree-depth and classes of bounded shrub-depth are interpretable
in trees of bounded depth; the main difference is how "dense" they are. By allowing "too
many" edges in graphs of bounded shrub-depth, we basically lose the ability to address edges
of the interpreted graph in the underlying tree and hence also the ability to quantify over
these edges and sets of edges (notice that this also means that our class of graphs is no longer
closed under taking subgraphs, but is still hereditary). Since this is exactly the difference
between MSO1 and MSO2, classes of graphs of bounded shrub-depth are natural candidates
for exactly those hereditary classes where FO = MSO1.